NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).

Unraveling overoptimism and publication bias in ML-driven science

https://doi.org/10.1016/j.patter.2025.101185

Saidi, Pouria; Dasarathy, Gautam; Berisha, Visar (April 2025, Patterns)

Free, publicly-accessible full text available April 1, 2026

A Label-Efficient Two-Sample Test

Li, Weizhi; Dasarathy, Gautam; Ramamurthy, Karthikeyan; Berisha, Visar (August 2022, Uncertainty in artificial intelligence)

Full Text Available

Digital medicine and the curse of dimensionality

https://doi.org/10.1038/s41746-021-00521-5

Berisha, Visar; Krantsevich, Chelsea; Hahn, P. Richard; Hahn, Shira; Dasarathy, Gautam; Turaga, Pavan; Liss, Julie (December 2021, npj Digital Medicine)

Abstract Digital health data are multimodal and high-dimensional. A patient’s health state can be characterized by a multitude of signals including medical imaging, clinical variables, genome sequencing, conversations between clinicians and patients, and continuous signals from wearables, among others. This high volume, personalized data stream aggregated over patients’ lives has spurred interest in developing new artificial intelligence (AI) models for higher-precision diagnosis, prognosis, and tracking. While the promise of these algorithms is undeniable, their dissemination and adoption have been slow, owing partially to unpredictable AI model performance once deployed in the real world. We posit that one of the rate-limiting factors in developing algorithms that generalize to real-world scenarios is the very attribute that makes the data exciting—their high-dimensional nature. This paper considers how the large number of features in vast digital health data can challenge the development of robust AI models—a phenomenon known as “the curse of dimensionality” in statistical learning theory. We provide an overview of the curse of dimensionality in the context of digital health, demonstrate how it can negatively impact out-of-sample performance, and highlight important considerations for researchers and algorithm designers.
Full Text Available

Automated semantic relevance as an indicator of cognitive decline: Out‐of‐sample validation on a large‐scale longitudinal dataset

https://doi.org/10.1002/dad2.12294

Stegmann, Gabriela; Hahn, Shira; Bhandari, Samarth; Kawabata, Kan; Shefner, Jeremy; Duncan, Cayla Jessica; Liss, Julie; Berisha, Visar; Mueller, Kimberly (January 2022, Alzheimer's & Dementia: Diagnosis, Assessment & Disease Monitoring)

Full Text Available

Analyzing the relationship between productivity and human communication in an organizational setting

https://doi.org/10.1371/journal.pone.0250301

Dutta, Arindam; Steiner, Elena; Proulx, Jeffrey; Berisha, Visar; Bliss, Daniel W.; Poole, Scott; Corman, Steven (July 2021, PLOS ONE)
Ruban, Nersisson (Ed.)
Though it is often taken as a truism that communication contributes to organizational productivity, there are surprisingly few empirical studies documenting a relationship between observable interaction and productivity. This is because comprehensive, direct observation of communication in organizational settings is notoriously difficult. In this paper, we report a method for extracting network and speech characteristics data from audio recordings of participants talking with each other in real time. We use this method to analyze communication and productivity data from seventy-nine employees working within a software engineering organization who had their speech recorded during working hours for a period of approximately 3 years. From the speech data, we infer when any two individuals are talking to each other and use this information to construct a communication graph for the organization for each week. We use the spectral and temporal characteristics of the produced speech and the structure of the resultant communication graphs to predict the productivity of the group, as measured by the number of lines of code produced. The results indicate that the most important speech and network features for predicting productivity include those that measure the number of unique people interacting within the organization, the frequency of interactions, and the topology of the communication network.
Full Text Available

Estimation of forced vital capacity using speech acoustics in patients with ALS

https://doi.org/10.1080/21678421.2020.1866013

Stegmann, Gabriela M.; Hahn, Shira; Duncan, Cayla J.; Rutkove, Seward B.; Liss, Julie; Shefner, Jeremy M.; Berisha, Visar (July 2021, Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration)

Full Text Available

Compressing LSTM Networks with Hierarchical Coarse-Grain Sparsity

https://doi.org/10.21437/Interspeech.2020-1270

Kadetotad, Deepak; Meng, Jian; Berisha, Visar; Chakrabarti, Chaitali; Seo, Jae-sun (October 2020, Interspeech 2020)
Full Text Available

Revisiting the accuracy problem in network analysis using a unique dataset

https://doi.org/10.1016/j.socnet.2020.12.010

Corman, Steven R.; Steiner, Elena; Proulx, Jeffrey D.; Dutta, Arindam; Yahja, Alex; Scott Poole, M.; Berisha, Visar; Bliss, Daniel W. (July 2021, Social Networks)

Full Text Available

An 8.93 TOPS/W LSTM Recurrent Neural Network Accelerator Featuring Hierarchical Coarse-Grain Sparsity for On-Device Speech Recognition

https://doi.org/10.1109/JSSC.2020.2992900

Kadetotad, Deepak; Yin, Shihui; Berisha, Visar; Chakrabarti, Chaitali; Seo, Jae-sun (July 2020, IEEE Journal of Solid-State Circuits)
Full Text Available

Finding the Homology of Decision Boundaries with Active Learning

Li, Weizhi; Dasarathy, Gautam; Ramamurthy, Karthikeyan N.; Berisha, Visar (January 2020, Advances in neural information processing systems)
Full Text Available
